Towards Learning Rules from Natural Texts
Abstract
In this paper, we consider the problem of inductively learning rules from specific facts extracted from texts. This problem is challenging for two reasons. First, natural texts are radically incomplete, since there are always too many facts to mention. Second, natural texts are systematically biased towards novelty and surprise, which presents an unrepresentative sample to the learner. Our solutions to these two problems are based on building a generative observation model of what is mentioned and what is extracted given what is true. First, we present a Multiple-predicate Bootstrapping approach that consists of iteratively learning if-then rules based on an implicit observation model and then imputing new facts implied by the learned rules. Second, we present an iterative ensemble co-learning approach, where multiple decision trees are learned from bootstrap samples of the incomplete training data, and facts are imputed based on a weighted majority.
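The learn-then-impute loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`learn_rules`, `impute`, `bootstrap`), the single-condition rule form, the thresholds, and the toy predicates `a`, `b`, `c` are all assumptions made for illustration. The only idea taken from the abstract is that rules are learned, implied facts are imputed, and the process repeats; here a missing value is treated as unknown rather than false when estimating rule confidence.

```python
from itertools import permutations


def learn_rules(records, predicates, min_conf=0.9, min_support=2):
    """Learn single-condition rules of the form body -> head.

    Confidence is estimated only over records where the body is true and
    the head is observed, so a missing value counts as unknown, not false.
    (Hypothetical simplification; thresholds are illustrative.)
    """
    rules = []
    for body, head in permutations(predicates, 2):
        observed = [rec for rec in records
                    if rec.get(body) is True and rec.get(head) is not None]
        if len(observed) < min_support:
            continue
        conf = sum(rec[head] for rec in observed) / len(observed)
        if conf >= min_conf:
            rules.append((body, head))
    return rules


def impute(records, rules):
    """Fill in missing head values implied by the rules, in place.

    Returns the number of newly imputed facts.
    """
    added = 0
    for rec in records:
        for body, head in rules:
            if rec.get(body) is True and rec.get(head) is None:
                rec[head] = True
                added += 1
    return added


def bootstrap(records, predicates, max_iters=10):
    """Alternate rule learning and imputation until no new facts appear."""
    records = [dict(rec) for rec in records]  # leave the input untouched
    rules = []
    for _ in range(max_iters):
        rules = learn_rules(records, predicates)
        if impute(records, rules) == 0:
            break
    return rules, records


# Toy, made-up facts; an absent key means the fact was not mentioned.
facts = [
    {"a": True, "b": True},
    {"a": True, "b": True},
    {"a": True, "c": True},
    {"b": True, "c": True},
    {"a": True},
]
rules, completed = bootstrap(facts, ["a", "b", "c"])
```

In this toy run, the first pass has enough support only for rules between `a` and `b`; the facts imputed from those rules then give rules involving `c` enough support in the second pass. The same example also shows the risk the abstract alludes to: with no negative facts mentioned, such a naive learner readily overgeneralizes.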
Similar resources
Learning Rules from Incomplete Examples: A Pragmatic Approach
In this paper, we consider the problem of inductively learning rules from specific facts extracted from texts. This problem is challenging due to two reasons. First, natural texts are radically incomplete since there are always too many facts to mention. Second, natural texts are systematically biased towards novelty and surprise, which presents an unrepresentative sample to the learner. Our so...
Structuring Natural Language Data by Learning Rewriting Rules
The discovery of relationships between concepts is a crucial point in ontology learning (OL). In most cases, OL is achieved from a collection of domain-specific texts, describing the concepts of the domain and their relationships. A natural way to represent the description associated to a particular text is to use a structured term (or tree). We present a method for learning transformation rule...
Inverting Grice's Maxims to Learn Rules from Natural Language Extractions
We consider the problem of learning rules from natural language text sources. These sources, such as news articles and web texts, are created by a writer to communicate information to a reader, where the writer and reader share substantial domain knowledge. Consequently, the texts tend to be concise and mention the minimum information necessary for the reader to draw the correct conclusions. We...
Mention Model for Learning Rules from Incomplete Examples
Introduction. We are motivated by the problem of learning rules from naturally available data sources such as natural language texts, web pages, and medical databases. At first, learning rules from natural sources like the web seems to consist of extracting specific facts followed by data mining of rules. Unfortunately, however, there are two major obstacles to fully realizing the dream of unli...
Learning Rules from Incomplete Examples via a Probabilistic Mention Model
We consider the problem of learning rules from natural language text sources. These sources, such as news articles, journal articles, and web texts, are created by a writer to communicate information to a reader, where the writer and reader share substantial domain knowledge. Consequently, the texts tend to be concise and mention the minimum information necessary for the reader to draw the corr...
Publication date: 2010